Text Classification for a Large-Scale Taxonomy Using Dynamically Mixed Local and Global Models for a Node

Authors

  • Heung-Seon Oh
  • Yoonjung Choi
  • Sung-Hyon Myaeng
Abstract

Hierarchical text classification for a large-scale Web taxonomy is challenging because the number of hierarchically organized categories is large and the training data for deep categories are usually sparse. A narrow-down approach involving a search of the taxonomical tree has been shown to be an effective method for the problem. A recent study showed that both local and global information for a node are useful for further improvement. This paper introduces two methods for mixing local and global models dynamically for individual nodes and shows that they improve classification effectiveness by 5% and 30%, respectively, over the state-of-the-art method.

Similar resources

A General Investigation on the Combination of Local and Global Feature Selection Methods for Request Identification in Telegram

Nowadays, the use of various messaging services is expanding worldwide with the rapid development of Internet technologies. Telegram is a cloud-based open-source text messaging service. According to the US Securities and Exchange Commission and based on the statistics given for October 2019 to present, 300 million people worldwide used Telegram per month. Telegram users are more concentrated in...

Link Prediction using Network Embedding based on Global Similarity

Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches with node-structure objectives are fast but not accurate enough. On the other hand, the global link predicti...

An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches

Text coherence evaluation has become a vital and appealing task in Natural Language Processing subfields, such as text summarization, question answering, text generation and machine translation. Existing methods such as entity-based and graph-based models deal with how nouns and noun phrases change roles across sequential sentences within a short part of a text. They even have limitations in global coheren...

Utilizing global and path information with language modelling for hierarchical text classification

Hierarchical text classification of a Web taxonomy is challenging because it is a very large-scale problem with hundreds of thousands of categories and associated documents. Furthermore, the conceptual levels and training data availability of categories vary widely. The narrow-down approach is the state of the art; it utilizes a search engine for generating candidates from the taxonomy and build...

A novel method for locating the local terrestrial laser scans in a global aerial point cloud

In addition to the heterogeneity of aerial and terrestrial views, small-scale terrestrial point clouds are hardly comparable with large-scale, overhead aerial point clouds. A hierarchical method is proposed for automatically locating terrestrial scans in an aerial point cloud. The proposed method begins with detecting the candidate positions for the deployment of the terrestrial laser scanne...

Publication date: 2011